
    PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network

    Music creation typically consists of two parts: composing the musical score, and then performing the score with instruments to make sounds. While recent work has made much progress in automatic music generation in the symbolic domain, few attempts have been made to build an AI model that can render realistic music audio from musical scores. Directly synthesizing audio with sound sample libraries often leads to mechanical and deadpan results, since musical scores do not contain performance-level information, such as subtle changes in timing and dynamics. Moreover, while the task may sound like a text-to-speech synthesis problem, there are fundamental differences, since music audio has rich polyphonic sounds. To build such an AI performer, we propose in this paper a deep convolutional model that learns in an end-to-end manner the score-to-audio mapping between a symbolic representation of music called the piano roll and an audio representation of music called the spectrogram. The model consists of two subnets: the ContourNet, which uses a U-Net structure to learn the correspondence between piano rolls and spectrograms and to give an initial result; and the TextureNet, which further uses a multi-band residual network to refine the result by adding the spectral texture of overtones and timbre. We train the model to generate music clips of the violin, cello, and flute, with a dataset of moderate size. We also present the results of a user study showing that our model achieves a higher mean opinion score (MOS) in naturalness and emotional expressivity than a WaveNet-based model and two commercial sound libraries. Our source code is available at https://github.com/bwang514/PerformanceNet

    Comment: 8 pages, 6 figures, AAAI 2019 camera-ready version
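
    To make the two-subnet design concrete, here is a minimal PyTorch sketch: a U-Net-style ContourNet produces a coarse spectrogram from a piano roll, and a residual TextureNet refines it. The channel counts, layer depths, and single-band residual body are illustrative assumptions, not the authors' published architecture (see the linked repository for the real implementation).

```python
# Minimal sketch of the ContourNet + TextureNet idea from the abstract.
# Shapes and layer sizes are assumptions chosen for illustration only.
import torch
import torch.nn as nn

class ContourNet(nn.Module):
    """U-Net-style subnet: piano roll (B, 1, pitch, time) -> coarse spectrogram."""
    def __init__(self, ch=32):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(1, ch, 3, stride=2, padding=1), nn.ReLU())
        self.down2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        self.up1 = nn.Sequential(nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU())
        # The skip connection doubles the channel count before the last layer.
        self.up2 = nn.ConvTranspose2d(ch * 2, 1, 4, stride=2, padding=1)

    def forward(self, roll):
        d1 = self.down1(roll)
        d2 = self.down2(d1)
        u1 = self.up1(d2)
        return self.up2(torch.cat([u1, d1], dim=1))  # U-Net skip connection

class TextureNet(nn.Module):
    """Residual refiner: adds overtone/timbre texture to the coarse spectrogram.
    (Single-band here; the paper describes a multi-band residual network.)"""
    def __init__(self, ch=32, blocks=3):
        super().__init__()
        layers = [nn.Conv2d(1, ch, 3, padding=1), nn.ReLU()]
        for _ in range(blocks):
            layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU()]
        layers.append(nn.Conv2d(ch, 1, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, coarse):
        return coarse + self.body(coarse)  # refine rather than regenerate

# Usage: a dummy 128-pitch x 256-frame piano roll in, a refined
# spectrogram-shaped tensor out (pitch-to-frequency-bin mapping omitted).
roll = torch.rand(1, 1, 128, 256)
spec = TextureNet()(ContourNet()(roll))
print(spec.shape)  # torch.Size([1, 1, 128, 256])
```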

    Affective Music Information Retrieval

    Much of the appeal of music lies in its power to convey emotions/moods and to evoke them in listeners. Consequently, the past decade witnessed a growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimensional model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space, with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound and capable of learning in an online fashion, and is thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of the aforementioned applications of AEG on a large-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions for future work are also discussed.

    Comment: 40 pages, 18 figures, 5 tables, author version
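
    To illustrate the core modeling idea, here is a minimal sketch, assuming scikit-learn and synthetic annotations: a Gaussian mixture is fit over multiple subjects' valence-arousal ratings of a single song, so that perceived emotion is represented as a distribution rather than a point. The feature-conditioned priors and online learning of the actual AEG model are omitted.

```python
# Minimal sketch of representing a song's perceived emotion as a
# probability distribution over the valence-arousal (VA) plane.
# Data below is synthetic; a real corpus would be AMG1608.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical annotations: 20 subjects each rate one song with a
# (valence, arousal) pair in [-1, 1]^2.
annotations = rng.normal(loc=[0.4, 0.2], scale=0.15, size=(20, 2))

# Fit a small Gaussian mixture over the VA annotations; the spread and
# multi-modality of the mixture capture listener subjectivity.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(annotations)

# Score a candidate VA point: higher density means the point is more
# consistent with how listeners perceive this song, which can serve
# emotion recognition or emotion-based retrieval.
query = np.array([[0.5, 0.25]])
print("log-density at query:", gmm.score_samples(query)[0])
```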

    The Effectiveness of Using Cloud-Based Cross-Device IRS to Support Classical Chinese Learning

    The purpose of the present study was to examine the effects of integrating a cloud-based cross-device interactive response system (CCIRS) on enhancing students' classical Chinese learning. The system is a cloud-based IRS which provides instructors and learners with an environment for immediate interactive learning and discussion in the classroom. A quasi-experimental design was employed in which the experimental group (E.G.) learned classical Chinese with the system, while the control group (C.G.) followed their original learning method. The results revealed that the novice and medium-achievement learners in the E.G. performed significantly better than the other E.G. students, and most students, as well as the instructor, gave positive feedback regarding the use of the system for course learning. In sum, CCIRS is an easy-to-use learning trigger that encourages students to participate in activities, stimulates course discussion, and helps students achieve social and self-directed learning. The study concludes that the idea of 'bring your own device' could be implemented with this system, while integrating educational factors such as game-based elements and competitive activities into the response system could reinforce flipped classroom learning.

    A Preliminary Study of Integrating a Flipped Classroom Strategy for Classical Chinese Learning

    This is a multiphase study which aims to investigate how to provide learners with a method to acquire classical Chinese by integrating mobile technology with the flipped classroom approach. In the first phase of the study, the researcher adopts informant design, using a questionnaire survey to understand students' and instructors' perceptions of using mobile learning devices for classical Chinese learning, and then constructs the system based on the pilot results. The paper describes the pilot questionnaire results, the structure of the developed mobile learning system, and the practical application of the system for classical Chinese teaching and learning.